Haplotype Association Mapping by Density-Based Clustering in Case-Control Studies (Work-in-Progress)

Authors

  • Jing Li
  • Tao Jiang
Abstract

Linkage disequilibrium (LD) mapping of complex diseases using haplotypes has been studied intensively in recent years, owing to the increased availability of large-scale, dense SNP (single nucleotide polymorphism) markers. LD mapping has many applications, e.g. finding disease-associated haplotypes and predicting disease susceptibility (DS) gene loci from a whole-genome scan. In this research, we develop a new algorithmic method for haplotype mapping based on a density-based clustering algorithm, and propose a new haplotype (dis)similarity measure. The method regards haplotype segments as data points in a high-dimensional space. Haplotypes that carry the DS gene, especially those carrying mutations of recent origin, tend to be close to each other due to linkage disequilibrium, while other haplotypes can be regarded as random noise sampled from the haplotype space. Clusters are then identified using a density-based clustering algorithm. The Pearson χ² statistic or a Z-score based on the numbers of cases and controls in a cluster can be used as an indicator of the degree of association between the cluster and the disease under study. The method does not require any assumptions about the evolutionary model or the inheritance pattern of the disease. The proposed similarity measure generalizes several haplotype similarity measures currently used in the literature, and it is robust against recent mutations, genotyping errors, and recombination events. Preliminary experimental results on an independent simulated data set, including both SNP markers and microsatellite markers, and on a real data set with a known DS gene location for type 1 diabetes show that our method can predict gene locations with high accuracy, even when the rate of phenocopies is high. This work is still in progress, and more data sets will be tested.
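The pipeline described in the abstract (treat haplotype segments as points, find dense clusters, then score each cluster by its case/control counts) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the specific density-based algorithm, so a DBSCAN-style procedure is assumed here; a plain mismatch count stands in for the authors' generalized (dis)similarity measure, which the abstract does not define; and the haplotypes, `eps`, and `min_pts` values are invented toy inputs.

```python
from collections import deque
from math import sqrt

def mismatches(a, b):
    """Stand-in dissimilarity (assumption): number of differing marker positions."""
    return sum(x != y for x, y in zip(a, b))

def density_clusters(haps, eps, min_pts, dist=mismatches):
    """DBSCAN-style clustering (assumed algorithm). Returns one label per
    haplotype; -1 marks noise, 0, 1, ... mark clusters."""
    n = len(haps)
    # Neighborhood of each point, including the point itself.
    nbrs = [[j for j in range(n) if dist(haps[i], haps[j]) <= eps] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(nbrs[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(nbrs[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(nbrs[j]) >= min_pts:  # only core points expand the cluster
                queue.extend(nbrs[j])
    return labels

def cluster_z_score(cases_in, controls_in, n_cases, n_controls):
    """Two-proportion Z-score: share of cases vs. share of controls in a cluster."""
    pooled = (cases_in + controls_in) / (n_cases + n_controls)
    se = sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    if se == 0:
        return 0.0
    return (cases_in / n_cases - controls_in / n_controls) / se

def cluster_chi2(cases_in, controls_in, n_cases, n_controls):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    (in cluster / not in cluster) x (case / control)."""
    a, b = cases_in, controls_in
    c, d = n_cases - cases_in, n_controls - controls_in
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom

# Toy data (invented): 1 = case, 0 = control.
haps = ["110100", "110100", "110101", "110100",
        "001011", "011110", "100001", "000111"]
status = [1, 1, 1, 0, 0, 0, 1, 0]

labels = density_clusters(haps, eps=1, min_pts=3)
in_cluster = [s for s, lab in zip(status, labels) if lab == 0]
cases_in = sum(in_cluster)
controls_in = len(in_cluster) - cases_in
n_cases = sum(status)
n_controls = len(status) - n_cases

z = cluster_z_score(cases_in, controls_in, n_cases, n_controls)
chi2 = cluster_chi2(cases_in, controls_in, n_cases, n_controls)
print(labels, round(z, 3), round(chi2, 3))
```

For a single cluster the uncorrected 2×2 Pearson statistic equals the square of the two-proportion Z-score, so either can serve as the association indicator; the Z-score additionally keeps the direction of the effect (enrichment in cases versus controls).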

Similar resources

Haplotype-based linkage disequilibrium mapping via direct data mining

MOTIVATION: With the availability of large-scale, high-density single-nucleotide polymorphism markers and information on haplotype structures and frequencies, a great challenge is how to take advantage of haplotype information in the association mapping of complex diseases in case-control studies. RESULTS: We present a novel approach for association mapping based on directly mining haplotypes (...

Density-based clustering in haplotype analysis for association mapping

Clustering of related haplotypes in haplotype-based association mapping has the potential to improve power by reducing the degrees of freedom without sacrificing important information about the underlying genetic structure. We have modified a generalized linear model approach for association analysis by incorporating a density-based clustering algorithm to reduce the number of coefficients in t...

Genome-wide association studies using haplotype clustering with a new haplotype similarity.

Association analysis, with the aim of investigating genetic variations, is designed to detect genetic associations with observable traits, which has played an increasing part in understanding the genetic basis of diseases. Among these methods, haplotype-based association studies are believed to possess prominent advantages, especially for the rare diseases in case-control studies. However, when...

A clustering approach for mineral potential mapping: A deposit-scale porphyry copper exploration targeting

This work describes a knowledge-guided clustering approach for mineral potential mapping (MPM), by which the optimum number of clusters is derived from a knowledge-driven methodology through a concentration-area (C-A) multifractal analysis. To implement the proposed approach, a case study at the North Narbaghi region in the Saveh, Markazi province of Iran, was investigated to discover porphyry ...

Association mapping by generalized linear regression with density-based haplotype clustering.

Haplotypes of closely linked single-nucleotide polymorphisms (SNPs) potentially offer greater power than individual SNPs to detect association between genetic variants and disease. We present a novel approach for association mapping in which density-based clustering of haplotypes reduces the dimensionality of the general linear model (GLM)-based score test of association implemented in the Hapl...

Publication date: 2004